Research Questions
Hypotheses
Descriptive Statistics
Primary Analysis
note
Analysis is awaiting Experian appending.
This analysis summaries the Nautilus customer data from the Frog
Hollow and Howell community solar farm participation from January 2020
to April 2022. The goal of the analysis is to first describe the
characteristics of the residents, the prevalence of default and/or churn
rates, the prevalence of churn rates, and any statistically significant
differences between groups in their default, late payments or churn
rates.
The data is comprised of monthly payment performance and, where
available, demographic data for residents. 32,385 monthly observations
over 813 homes, 812 accounts and 621 users were initially observed. To
avoid data quality issues and to analyze any potential churn and default
trends, for Frog Hollow households the analysis will drop values in
October 2021 until April 2022. This narrowed dataset is comprised of
31,704 observations; the homes, accounts and users did not change.
The first ten rows of the data are presented, grouped by account
number. Unique identifiers were replaced with sequential IDs.
- Tenure: length in months each account was active
- Payment_Method: either card or ACH/direct debit method
- temp_solar: annual (right?) solar allocation in kWh
- Churn_Tag: binary _var_iable capturing if an account left the solar
farm
- defauted: binary variable capturing if account holder defaulted on
payment
- Income: Low (<$50,000) or High (>$50,000)
- Solar_Farm: Either Howell or Frog Hollow
First 10 Home Rows
|
Account_ID
|
tenure
|
payment_method
|
temp_solar
|
Churn_Tag
|
defaulted
|
Income
|
solar_farm
|
churn_reason
|
|
1
|
17
|
card
|
25.68
|
0
|
0
|
NA
|
Howell (O&R)
|
0
|
|
2
|
22
|
card
|
6.65
|
0
|
0
|
NA
|
Howell (O&R)
|
0
|
|
3
|
18
|
card
|
0.96
|
1
|
0
|
NA
|
Howell (O&R)
|
Billing Complaint
|
|
4
|
24
|
card
|
83.40
|
0
|
0
|
NA
|
Howell (O&R)
|
0
|
|
5
|
25
|
card
|
6.05
|
0
|
0
|
High
|
Howell (O&R)
|
0
|
|
6
|
23
|
ach
|
14.54
|
0
|
0
|
NA
|
Howell (O&R)
|
0
|
|
7
|
23
|
ach
|
30.65
|
0
|
0
|
NA
|
Howell (O&R)
|
0
|
|
8
|
3
|
NA
|
24.87
|
0
|
0
|
High
|
Frog Hollow (CHGE)
|
0
|
|
9
|
25
|
card
|
7.92
|
0
|
0
|
NA
|
Howell (O&R)
|
0
|
|
10
|
23
|
card
|
11.20
|
0
|
0
|
High
|
Howell (O&R)
|
0
|
Descriptive Statistics
Total
The summary statistics below provide additional information. The
average tenure of the 812 accounts was 20.3 months. The churn rate was
12.4%. The sample does skew towards high income homes, however note a
large number of missing values were observed in income (567, 70%).
Descriptive Statistics
|
|
Total
|
Mean
|
Max
|
Min
|
|
Tenure
|
16486.00
|
20.30
|
25.0
|
1
|
|
Defaults
|
4.00
|
0.01
|
1.0
|
0
|
|
Churn
|
101.00
|
0.14
|
1.0
|
0
|
|
Low Income
|
21.00
|
0.09
|
NA
|
NA
|
|
High Income
|
224.00
|
0.91
|
NA
|
NA
|
|
Solar
|
6294.78
|
7.75
|
83.4
|
0
|
|
Payment by Card
|
632.00
|
0.79
|
NA
|
NA
|
|
Payment by ACH
|
172.00
|
0.21
|
NA
|
NA
|
Frog Hollow
Frog Hollow Descriptive Statistics
|
|
Total
|
Mean
|
Max
|
Min
|
|
Tenure
|
8799.00
|
19.38
|
22.00
|
2
|
|
Defaults
|
1.00
|
0.00
|
1.00
|
0
|
|
Churn
|
53.00
|
0.14
|
1.00
|
0
|
|
Low Income
|
9.00
|
0.21
|
NA
|
NA
|
|
High Income
|
33.00
|
0.79
|
NA
|
NA
|
|
Solar
|
3269.62
|
7.20
|
48.14
|
0
|
|
Payment by Card
|
376.00
|
0.84
|
NA
|
NA
|
|
Payment by ACH
|
71.00
|
0.16
|
NA
|
NA
|
Howell
Howell Descriptive Statistics
|
|
Total
|
Mean
|
Max
|
Min
|
|
Tenure
|
7687.00
|
21.47
|
25.0
|
1
|
|
Defaults
|
3.00
|
0.01
|
1.0
|
0
|
|
Churn
|
48.00
|
0.15
|
1.0
|
0
|
|
Low Income
|
12.00
|
0.06
|
NA
|
NA
|
|
High Income
|
191.00
|
0.94
|
NA
|
NA
|
|
Solar
|
3025.16
|
8.45
|
83.4
|
0
|
|
Payment by Card
|
256.00
|
0.72
|
NA
|
NA
|
|
Payment by ACH
|
101.00
|
0.28
|
NA
|
NA
|
Tenure Length
Obersvations are next grouped by tenure length.
Account Tenure Length Statistics
|
|
Number of homes
|
Churned
|
Defaulted
|
Average Solar Allocation
|
Payment by Card
|
Payment by ACH
|
|
≤ 5
|
29
|
12
|
0
|
5.01
|
18
|
3
|
|
5 to 10
|
42
|
22
|
0
|
7.27
|
35
|
7
|
|
10 to 15
|
23
|
10
|
3
|
7.66
|
22
|
1
|
|
15 to 20
|
718
|
57
|
1
|
7.89
|
557
|
161
|
|
Total
|
812
|
101
|
4
|
7.75
|
632
|
172
|
For homes that churned, the tenure was reasonbly much shorter than
those that did not churn. The below graph shows the average tenure
between these two groups, showing that it is almost double for those
homes that exhibit continuous subscription.
Income
Resident Income Statistics
|
Income
|
Number of accounts
|
Churned
|
Defaulted
|
Average Solar Allocation
|
Payment by Card
|
Payment by ACH
|
|
High
|
224
|
46
|
3
|
6.56
|
191
|
29
|
|
Low
|
21
|
7
|
1
|
4.54
|
18
|
2
|
|
NA
|
567
|
48
|
0
|
8.34
|
423
|
141
|
|
Total
|
812
|
101
|
4
|
7.75
|
632
|
172
|
Churn Reasons
When a respondent left the program, a reason was recorded, when
available. The below summarises the accounts that experienced a churn,
which may further be collapsed into categories. Note the total below may
not match the churn totals in the above descriptive statistics due to
defaulted payments and duplicates included.
|
Churn Reasons
|
Total
|
High Income
|
Low Income
|
No Income
|
|
Moving out of service territory
|
39
|
18
|
5
|
16
|
|
No Longer Interested
|
35
|
21
|
2
|
12
|
|
Billing Complaint
|
12
|
2
|
1
|
9
|
|
Ineligible Meter
|
7
|
1
|
0
|
6
|
|
Defaulted Payment
|
6
|
3
|
1
|
2
|
|
Duplicate
|
6
|
3
|
0
|
3
|
|
Got Rooftop
|
5
|
3
|
0
|
2
|
|
Rejected by Utility - Initial Submission
|
4
|
0
|
0
|
4
|
|
Health Issues
|
3
|
3
|
0
|
0
|
|
Rejected by Utility - Subsequent Submission
|
1
|
0
|
0
|
1
|
|
Total
|
118
|
54
|
9
|
55
|
temp <- raw %>%
group_by(user_id,utility_acct_number) %>%
summarise(tenure = length(unique(date_concat)),
count_accts = n(),
default = sum(default_tag=="Defaulted Payment"),
# count_accts = length(unique(utility_acct_number)) ,
count_churn = sum(default_tag !=0 & default_tag !="Defaulted Payment")/length(unique(date_concat))) #%>%
## `summarise()` has grouped output by 'user_id'. You can override using the
## `.groups` argument.
# filter(count_accts > 1)
temp
## # A tibble: 812 × 6
## # Groups: user_id [620]
## user_id utility_acct_number tenure count_accts default count_churn
## <dbl> <chr> <int> <int> <int> <dbl>
## 1 17 1769073011 23 23 0 0
## 2 19 7989248005 3 3 0 1
## 3 31 689005016 21 44 0 2.10
## 4 32 21002383038 22 23 0 0
## 5 33 2155441000 15 30 0 2
## 6 33 2536130080 17 34 0 2
## 7 37 21003282874 22 46 0 0
## 8 38 21001139837 22 69 0 0
## 9 39 7376061001 22 44 0 0
## 10 42 4381440011 15 15 0 1
## # … with 802 more rows
Model
A logit model is deployed to analyze the probability of a home
churning. The model in brief is described below. Without additional data
collected from Experian, limited variability can be measured due to NA
values in addition to limited number of coefficients to measure.
In the limited data available, statistically significant effects were
measured for probability of churn for both tenure (Months) and days
late. Due to relatively small number of observations of late payments,
no statistically significant results were measured in probability of
late or default payments.
\[\begin{equation}
P_{d} = \beta_{0} + \beta_{1}*Months +\beta_{2}*(Payment Method = Card)
+\beta_{3}*Income=Low \\ +\beta_{4}*Solar kwH + \beta{5}*(Solar
Farm=Howell)
\end{equation}\]
##
## Call:
## glm(formula = Churn_Tag ~ tenure + payment_method + Income +
## temp_solar + solar_farm, family = binomial(link = "logit"),
## data = joined)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.5611 -0.4707 -0.4147 -0.2964 2.2310
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 4.10760 1.07719 3.813 0.000137 ***
## tenure -0.21760 0.03860 -5.637 0.0000000173 ***
## payment_methodcard 0.01136 0.72856 0.016 0.987562
## IncomeLow 0.18221 0.66940 0.272 0.785467
## temp_solar -0.04837 0.05437 -0.890 0.373615
## solar_farmHowell (O&R) -1.08611 0.51024 -2.129 0.033286 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 242.38 on 218 degrees of freedom
## Residual deviance: 165.86 on 213 degrees of freedom
## (593 observations deleted due to missingness)
## AIC: 177.86
##
## Number of Fisher Scoring iterations: 5
The logit curves below show the relationship between the distribution
of both tenure and days late with the probability of churning. The first
graph shows that the longer the tenure, the lower the likelihood. The
second graph shows that the days of late payments are generally not
associated with likelihood of churn.

